Prediction of Queue Waiting Times for Metascheduling on Parallel Batch Systems

Authors

  • Rajath Kumar
  • Sathish S. Vadhiyar
Abstract

Predicting the queue waiting times of jobs submitted to production parallel batch systems is important for providing overall estimates to users and can also help metaschedulers make scheduling decisions. In this work, we have developed a framework for predicting ranges of queue waiting times for jobs by employing multi-class classification of similar jobs in history. Our hierarchical prediction strategy first predicts the point wait time of a job using a dynamic k-Nearest Neighbor (kNN) method. It then performs a multi-class classification using Support Vector Machines (SVMs) among all the classes of the jobs. The probabilities given by the SVM for the class predicted using kNN and for its neighboring classes are used to provide a set of ranges of predicted wait times with probabilities. We have used these predictions and probabilities in a metascheduling strategy that distributes jobs to different queues/sites in a multi-queue/grid environment to minimize the wait times of the jobs. Our experiments with different production supercomputer job traces show that our prediction strategies give correct predictions for about 77-87% of the jobs and improve accuracy by about 12% compared to the next best existing method. Our experiments with our metascheduling strategy, using different production and synthetic job traces for various system sizes, partitioning schemes and workloads, show that it performs much better than existing scheduling policies, reducing the overall average queue waiting time of the jobs by about 47%.
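The hierarchical idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class boundaries, the job features, the fixed k (the paper uses a dynamic kNN), and the expected-wait selection rule are all assumptions made for illustration, and the class probabilities that the paper obtains from an SVM are simply passed in as a list.

```python
import math

# Hypothetical wait-time class boundaries in minutes (not taken from the paper).
CLASS_BOUNDS = [0, 15, 60, 240, 1440, 10080]


def knn_point_wait(history, features, k=3):
    """Distance-weighted mean wait of the k most similar past jobs.

    history: list of (feature_tuple, wait_minutes) pairs.
    A fixed k is used here; the paper's kNN is dynamic.
    """
    nearest = sorted((math.dist(f, features), wait) for f, wait in history)[:k]
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * wait for w, (_, wait) in zip(weights, nearest)) / sum(weights)


def wait_class(wait):
    """Index of the class whose range [lo, hi) contains the wait."""
    for i in range(len(CLASS_BOUNDS) - 1):
        if CLASS_BOUNDS[i] <= wait < CLASS_BOUNDS[i + 1]:
            return i
    return len(CLASS_BOUNDS) - 2  # waits beyond the last bound fall in the last class


def predicted_ranges(point_wait, class_probs):
    """Ranges for the kNN-predicted class and its neighbours, with probabilities.

    class_probs: one probability per class, assumed to come from a
    multi-class classifier such as an SVM with probability outputs.
    """
    c = wait_class(point_wait)
    return [
        ((CLASS_BOUNDS[i], CLASS_BOUNDS[i + 1]), class_probs[i])
        for i in (c - 1, c, c + 1)
        if 0 <= i < len(class_probs)
    ]


def expected_wait(ranges):
    """Probability-weighted midpoint of the predicted ranges (assumed scoring rule)."""
    total = sum(p for _, p in ranges)
    if total == 0:
        return math.inf
    return sum(p * (lo + hi) / 2 for (lo, hi), p in ranges) / total


def pick_queue(ranges_by_queue):
    """Choose the queue whose predicted ranges give the lowest expected wait."""
    return min(ranges_by_queue, key=lambda q: expected_wait(ranges_by_queue[q]))
```

A metascheduler along these lines would call `predicted_ranges` once per candidate queue and pass the results to `pick_queue`. Scoring by the probability-weighted midpoint is only one plausible way to combine ranges and probabilities; the abstract does not say which rule the paper uses.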

Similar resources

Identifying Quick Starters: Towards an Integrated Framework for Efficient Predictions of Queue Waiting Times of Batch Parallel Jobs

Production parallel systems are space-shared and hence employ batch queues in which the jobs submitted to the systems are made to wait before execution. Thus, jobs submitted to parallel batch systems incur queue waiting times in addition to the execution times. Prediction of these queue waiting times is important to provide overall estimates to the users and can also help metaschedulers make sc...

Qespera: an adaptive framework for prediction of queue waiting times in supercomputer systems

Production parallel systems are space-shared, and resource allocation on such systems is usually performed using a batch queue scheduler. Jobs submitted to the batch queue experience a variable delay before the requested resources are granted. Predicting this delay can assist users in planning experiment time-frames and choosing sites with less turnaround times and can also help meta-schedulers...

Grids with multiple batch systems for performance enhancement of multi-component and parameter sweep parallel applications

In this work, we evaluate the benefits of using Grids with multiple batch systems to improve the performance of multi-component and parameter sweep parallel applications by reduction in queue waiting times. Using different job traces of different loads, job distributions and queue waiting times corresponding to three different queuing policies (FCFS, conservative and EASY backfilling), we condu...

Multiple vacation policy for M^X/H_k/1 queue with un-reliable server

This paper studies the operating characteristics of an M^X/H_k/1 queueing system under multiple vacation policy. It is assumed that the server goes for vacation as soon as the system becomes empty. When he returns from a vacation and there is one or more customers waiting in the queue, he serves these customers until the system becomes empty again, otherwise goes for another vacation. The brea...

On Two-Echelon Multi-Server Queue with Balking and Limited Intermediate Buffer

In this paper we study two-echelon multi-server tandem queueing systems where customers arrive according to a Poisson process with two different rates. The service rates at both echelons are independent of each other. The service of each customer is assumed to be completed in two stages. The service times at each stage are exponentially distributed. At the first stage, the customers may balk ...

Publication date: 2014